The target task of this study is grounded language understanding for domestic service robots (DSRs). In particular, we focus on instruction understanding for short sentences where verbs are missing. This task is critically important for building communicative DSRs because manipulation is essential for DSRs. Existing instruction understanding methods usually estimate missing information only from non-grounded knowledge; therefore, it has been unclear whether the predicted action is physically executable. In this paper, we present a grounded instruction understanding method to estimate appropriate objects given an instruction and situation. We extend Generative Adversarial Nets (GAN) and build a GAN-based classifier using latent representations. To quantitatively evaluate the proposed method, we have developed a data set based on the standard data set used for Visual QA. Experimental results show that the proposed method gives better results than the baseline methods.